Skip to content

Conversation

@julianguinard
Copy link
Contributor

@julianguinard julianguinard commented Feb 24, 2025

TL:DR

Add the ability for metrics-api scaler to get metrics from all endpoint targets of a kubernetes API service (aggregateFromKubeServiceEndpoints: true metadata) and aggregate them using average, sum, min or max aggregation functions (aggregationType metadata), which is a handy feature in an environment where one didn't set up a metric aggregator/scraping stack (i.e prometheus), or simply doesn't want to use their monitoring stack to fetch and serve metrics from customers workload in their own kubernetes clusters, and leave the metrics API's reponsability up to the customer

This PR comes from an issue met when metrics-api scaler targets an internal kubernetes service, such as the one from metrics_api E2E tests but, unlike it, has more than 1 replica serving it, leading to inconsistent HPA average calculation

The more a kubernetes service used by metrics API scaler has pods as targets, the less likely it is for the metrics API scaler to get all metrics from all these pods, leading to inconsistent HPA average metric computation and eventually to scaling issues. Especially in configurations where kubernetes services have "sticky" configurations

Below are screenshots of the modified E2E metrics_api_test.go, commenting out aggregateFromKubeServiceEndpoints: true addition to metadata, being executed & failing because scale in never occurs as every metric from all 10 replicas behind metrics server deployment are not all well taken into account

Screenshot from 2025-02-24 17-06-35

logs from keda-operator show same value being returned to metrics-api scaler by the kubernetes service (16)
Screenshot from 2025-02-24 16-18-07

As a consequence, test eventually fails because scaled deployment is unable to scale out within 3 minutes timeframe
Screenshot from 2025-02-24 16-19-00

Uncommenting aggregateFromKubeServiceEndpoints: true addition to metadata results in metric aggregation mode and same test passes this time

Screenshot from 2025-02-24 16-22-14
Screenshot from 2025-02-24 16-22-07

Checklist

Helm PR : kedacore/charts#739
Docs PR : kedacore/keda-docs#1541

@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch 2 times, most recently from 742e19d to be170fd Compare February 25, 2025 09:57
@julianguinard julianguinard marked this pull request as ready for review February 25, 2025 10:31
@julianguinard julianguinard requested a review from a team as a code owner February 25, 2025 10:31
@julianguinard julianguinard changed the title Add aggregation from kube service endpoints feature in metrics API scaler Feat : Add aggregation from kube service endpoints feature in metrics API scaler Feb 25, 2025
@zroubalik
Copy link
Member

zroubalik commented Mar 5, 2025

/run-e2e metrics_api
Update: You can check the progress here

@julianguinard
Copy link
Contributor Author

julianguinard commented Mar 6, 2025

/run-e2e metrics_api Update: You can check the progress here

@zroubalik thanks for triggering E2E, which ran fine here

@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch 2 times, most recently from 172e504 to 61ebd1f Compare March 11, 2025 15:02
@julianguinard
Copy link
Contributor Author

@zroubalik FYI I added this commit to make E2E more resilient on clusters where launching all metrics servers replicas take a longer time than expected to start : we now sleep 1 second & try 4 more times to validate expected pods before failing

It also fixes formatting of each endpoint URL to reach for aggregating metrics (double "/" in path, did not trigger errors in E2E because server ran by ghcr.io/kedacore/tests-metrics-api seems to cope with it anyway)

@JorTurFer
Copy link
Member

JorTurFer commented Mar 14, 2025

/run-e2e metrics_api
Update: You can check the progress here

@julianguinard
Copy link
Contributor Author

/run-e2e metrics_api Update: You can check the progress here

@JorTurFer thanks for E2E trigger which all ran fine

waiting for reviews & opinions on this now :)

@julianguinard julianguinard requested a review from JorTurFer March 19, 2025 10:31
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
@JorTurFer
Copy link
Member

JorTurFer commented Mar 24, 2025

/run-e2e metrics_api
Update: You can check the progress here

@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch from 1c33197 to 25f9b0d Compare March 24, 2025 10:37
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch from 215053a to 908f8c3 Compare March 24, 2025 10:43
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch from 908f8c3 to c6ca0e4 Compare March 24, 2025 10:44
julianguinard added a commit to julianguinard/keda that referenced this pull request Mar 24, 2025
…rics from all kubernetes service endpoints fail for whatever reason

see kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>
@JorTurFer
Copy link
Member

JorTurFer commented Mar 24, 2025

/run-e2e metrics_api
Update: You can check the progress here

Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! I've kept some implementation ideas inline

@keda-automation keda-automation requested review from a team October 1, 2025 07:40
@julianguinard
Copy link
Contributor Author

Repushed for changelog conflict solving

@rickbrouwer
Copy link
Member

rickbrouwer commented Oct 1, 2025

/run-e2e metrics
Update: You can check the progress here

Copy link
Member

@wozniakjan wozniakjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks super usefull, few items regarding EndpointSlice usage below

julianguinard added a commit to julianguinard/keda that referenced this pull request Oct 8, 2025
@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch from d0497e5 to 5bc7805 Compare October 8, 2025 07:51
@keda-automation keda-automation requested a review from a team October 8, 2025 07:51
@julianguinard
Copy link
Contributor Author

this looks super usefull, few items regarding EndpointSlice usage below

Hello @wozniakjan and thanks for your feedback ❤️
I adapted EndpointSlice usage according to tour suggestions. In the iteration, we now discard any unready endpoint or endpoint adresses which have already been registered
feel free to review it again!

@keda-automation keda-automation requested a review from a team October 8, 2025 12:05
julianguinard and others added 3 commits October 8, 2025 14:26
…scaler

- make generate-scalers-schema

Signed-off-by: julian GUINARD <[email protected]>
Co-authored-by: Jan Wozniak <[email protected]>
Signed-off-by: julianguinard <[email protected]>
@julianguinard julianguinard force-pushed the add-metrics-api-aggregation-from-kube-service-feature branch from 881798c to 0a5d42c Compare October 8, 2025 12:27
@rickbrouwer
Copy link
Member

rickbrouwer commented Oct 20, 2025

/run-e2e metrics_api
Update: You can check the progress here

@julianguinard
Copy link
Contributor Author

julianguinard commented Oct 21, 2025

E2E all green. @wozniakjan feel free to have a third look at it whenever you wish :)

@wozniakjan wozniakjan merged commit 8471b65 into kedacore:main Oct 21, 2025
24 checks passed
@julianguinard julianguinard deleted the add-metrics-api-aggregation-from-kube-service-feature branch October 22, 2025 09:08
@julianguinard
Copy link
Contributor Author

Thanks for the mege @wozniakjan !
we may also merge twin doc & helm chart PRs
kedacore/keda-docs#1617
kedacore/charts#739

alt-dima pushed a commit to alt-dima/keda that referenced this pull request Dec 13, 2025
… API scaler (kedacore#6565)

* - Add aggregation from kube service endpoints feature in metrics API scaler
- make generate-scalers-schema

Signed-off-by: julian GUINARD <[email protected]>

* apply suggestions :
- kedacore#6565 (comment)
- kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>

* Update pkg/scalers/metrics_api_scaler.go

Co-authored-by: Jan Wozniak <[email protected]>
Signed-off-by: julianguinard <[email protected]>

---------

Signed-off-by: julian GUINARD <[email protected]>
Signed-off-by: julianguinard <[email protected]>
Co-authored-by: Jan Wozniak <[email protected]>
Signed-off-by: Dmitriy Altuhov <[email protected]>
tangobango5 pushed a commit to tangobango5/keda that referenced this pull request Dec 22, 2025
… API scaler (kedacore#6565)

* - Add aggregation from kube service endpoints feature in metrics API scaler
- make generate-scalers-schema

Signed-off-by: julian GUINARD <[email protected]>

* apply suggestions :
- kedacore#6565 (comment)
- kedacore#6565 (comment)

Signed-off-by: julian GUINARD <[email protected]>

* Update pkg/scalers/metrics_api_scaler.go

Co-authored-by: Jan Wozniak <[email protected]>
Signed-off-by: julianguinard <[email protected]>

---------

Signed-off-by: julian GUINARD <[email protected]>
Signed-off-by: julianguinard <[email protected]>
Co-authored-by: Jan Wozniak <[email protected]>
julianguinard added a commit to julianguinard/keda that referenced this pull request Jan 5, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
julianguinard added a commit to julianguinard/keda that referenced this pull request Jan 5, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
julianguinard added a commit to julianguinard/keda that referenced this pull request Jan 5, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
wozniakjan pushed a commit that referenced this pull request Jan 6, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
alt-dima pushed a commit to alt-dima/keda that referenced this pull request Jan 9, 2026
the final semaphore acquisition was still missing, which resulted in loop not waiting until all metrics were gathered from each metrics API server

Signed-off-by: julian GUINARD <[email protected]>
Signed-off-by: Dima Altukhov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants